A Wrapping Architecture for IR Systems to Mediate External Structured Document Sources
Abstract
With the growth of digital libraries and electronic publishing, many structured document sources are appearing, and their effective mediation is an important research topic. In this paper, we propose a wrapping architecture for externally maintained structured document sources. Our wrapping target is information retrieval systems (IRSs) that provide access to structured documents. We describe a wrapper construction method for such IRSs with limited functionality. A constructed wrapper enhances the retrieval facilities of the underlying IRS and provides an object database view to the mediator. We focus on determining whether the underlying IRS can support a given query. Then we discuss some research issues related to our wrapping architecture.
Similar resources
Semi-Structured Document Classification
Document classification has developed over the last ten years, using techniques originating from the pattern recognition and machine learning communities. All these methods operate on flat text representations in which word occurrences are considered independent. The recent paper (Sebastiani, 2002) gives a very good survey on textual document classification. With the development of st...
Semi-Automatic Wrapper Generation for Commercial Web Sources
Semi-automatic wrapper generation tools aim to ease the task of building structured views over semi-structured web sources. But the wrapper generation techniques presented to date are unable to properly deal with sources requiring complex navigational sequences for accessing data. In this paper, we present Wargo, a semi-automatic wrapper generation tool, which has been used by non-programmer...
A Proposed Architecture for Designing Integrated Information Systems in Research and Development Department of Universities of Medical Sciences: A Case Study of Ahvaz Jundishapur University of Medical Sciences
Background and Aim: This study aimed to propose a consistent architecture to design integrated and flexible information systems for the Vice-Chancellor for Research and Technology of Ahvaz Jundishapur University of Medical Sciences (AJUMS). Materials and Methods: This applied research employed an integrated design based on business system planning (BSP) and James Martin's model for the design...
Intelligent Wrapping from PDF Documents
Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. The semi-structured form of web pages, coupled with the availability of business-relevant data, has led to the availability of several established products on the market for wrapping data from the Web. One such approach is the Lixto me...
5 Semi-structured Document Classification
Document classification has developed over the last 10 years, using techniques originating from the pattern recognition and machine-learning communities. All these methods operate on flat text representations, where word occurrences are considered independent. The recent paper by Sebastiani (2002) gives a very good survey on textual document classification. With the development of structured textu...
Publication date: 1997